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exchange Monte Carlo methods. We find qualitative correspondence in the results, but 
serious quantitative differences using the Nose-Hoover chain thermostat. For analyzing 
the deviations, we study different parameterizations of the Nose-Hoover chain thermostat. 
Autocorrelations from molecular dynamics and Metropolis Monte Carlo runs arc also in- 
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1 Introduction 

Understanding protein folding is one of the most complex challenges of science. The main 
reason is the characteristic property common to all proteins that the individual biolog- 
ical function strongly correlates with the three-dimensional geometry of the native fold. 
Proteins are functional ingredients in all biological systems and misfolding, mutational, or 
nonfunctional aggregation typically entail influential disturbances in biological networks 
which frequently lead to still incurable diseases. For this reason, there is an enormously 
growing interdisciplinary interest in understanding the often spontaneous structure forma- 
tion of proteins, which generally depends on the linear sequence of amino acids building 
up the macromolecule. Most of the proteins consist of hundreds of amino acids and, 
therefore, thousands of atoms. The folding process is guided by usual noncovalent phys- 
ical and chemical interactions, such as Coulomb forces due to residual charges, hydrogen 
bonding, submolecular van der Waals potentials, energetic torsional barriers, and the effec- 
tive hydrophobic force, which results as an effective interaction with the aqueous solvent 
surrounding most of the non-membrane proteins. 

Beside the extensive bioanalytical efforts in the past few decades to resolve by X-ray and 
NMR techniques native folds of tens of thousands of bioproteins, computational simulation 
methods have become very popular tools for studying experimentally hardly manageable 
dynamical and thermodynamic properties accompanying the folding kinetics. In particular, 
this incorporates the central question of the cooperative conformational transitions the 
protein experiences in the formation of secondary structures (e.g., helices, sheets) and 
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tertiary folds (e.g., hydrophobic-core domains). 

In computational chemistry and physics, three main simulation techniques are used, which 
primarily focus on different aspects. Dynamical quantities of a process are frequently 
addressed by means of molecular dynamics (MD) methods [1, 2], where the phase-space 
trajectory of a system is numerically calculated employing a discrete, but symplectic in- 
tegration scheme for the system's equations of motion. One of the essential properties of 
Hamiltonian dynamics in continuum is that the phase-space density is constant in time. 
Thus, a necessary condition for a discrete symplectic integration is that the Jacobian of 
the transformation from the system's state at time t to the state at time i -|- At is unity. 
Conversely, Monte Carlo (MC) simulations are typically used to reveal the thermodynamic 
equilibrium properties of a system by sampling system conformations in the thermally rele- 
vant region of the phase space [3, 4]. Since configurational updates in MC are in fact based 
on a Markov process, MC also possesses a kind of pseudo-dynamics. The third frequently 
used method is based on the density functional and therefore provides a tool for ab-initio 
quantum-chemical calculations of electronic structures of many-particle systems [5] . 
Prom a statistical physics point of view, conventional MD samples a microcanonical ensem- 
ble [6], as the system energy is kept constant. However, thermal fluctuations in the physio- 
logical temperature interval are of essential importance for biological processes. Therefore, 
in more realistic simulations, the energy should fluctuate according to a certain, e.g., 
Boltzmannian distribution. In consequence, the system's degrees of freedom have to be 
coupled to an environment, e.g., a heat-bath to provide constant temperature. In our 
MD simulations, we used the stochastic Andersen thermostat [2, 7] and the deterministic 
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Nose- Hoover chain (NHC) thermostat [2, 8, 9, 10]. It was shown for polymer systems that 
ensemble averages of dynamic properties are unaffected by coupling Newtonian dynamics 
with Nose-Hoover thermostats [11, 12]. Thermostats were introduced to ensure that the 
system trajectory in principle samples the phase space according to the Boltzmann distri- 
bution. This means that canonical statistics should be exactly satisfied in an infinitely long 
thermostated MD run. In practice, however, the question is what is a sufficiently long run 
to accumulate reliable statistics. By definition, MD generates an extremely local "update" 
within a single time step, whereas MC sweeps provide larger jumps in the phase space. 
Therefore, it is to be expected that correlations decay much slower in MD, compared to 
MC. 

In this paper, we aim at a comparative study of thermostated MD [2, 7, 8, 9, 10] and 
replica-exchange MC [13]. To this end, we consider a simplified coarse-grained bead- 
spring heteropolymer model and concentrate on thermodynamic properties. The paper 
is organized as follows. In Sec. 2, the AB model, modified by flexible virtual bonds, 
is revisited. Furthermore, we summarize the essentials of the Andersen and the NHC 
thermostat used in our constant-temperature MD simulations, and the replica-exchange 
generalized-ensemble MC method which is used for generating reference data. The results 
obtained with these methods are discussed and the deviations are analysed in Sec. 3. The 
paper is concluded by a summary in Sec. 4. 
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2 Model and methods 

In this section, we introduce the model employed in our comparative study and summarize 
the details of the methods used. 

2.1 Hydrophobic-polar heteropolymer with flexible covalent bonds 

Our study is based on a simple coarse-grained heteropolymer model, which is strongly 
related to the AB model [14], where only hydrophobic (^4) and hydrophilic (B) monomers 
are distinguished. In the original AB model, covalent bonds between adjacent monomers 
have a fixed length. This corresponds to the known stiffness of these virtual bonds [15]. 
Here, we relax this constraint through replacing the stiff bond by a harmonic spring. The 
reason is of rather technical nature as it drastically simplifies the MD implementation 
using Cartesian coordinates. MD simulations of polymers with stiff bonds are in principle 
possible, e.g., by using the standard Shake [16] or Rattle [17] algorithms. 
Denoting the set of coordinates for the N monomers by R = {ri, . . . , r^}, we define our 
bead-spring variant of the AB model with £o being an overall energy scale as 

V{R) - £o[^^bend(R) + ^^Lj(R) + ^^harm(R)], (1) 

where 

^ N-2 
* fc=l 

is the bending energy and the sum runs over the {N — 2) bending angles of successive 
bond vectors. The monomer-type dependent intramolecular potential between nonbonded 
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monomers i and j with distance rij — jr, — r^j is of Lennard- Jones form: 

«u = 4E E (;^-^). (3) 

i=l j=i+2 \' ij ' ij J 

The monomer-type dependence of this contribution is expressed by the parameter C: 

+ 1, (Ti, Oj — A, 

C{'Ji,<Jj) = \ +1/2, ai,aj^B, (4) 
-1/2, ai 7^ aj. 

The third term in Eq. (1) is the harmonic-spring extension of the AB model and reads: 

N-l 

^^harm = « X] ('^"+1 " ^o)^ ■ (5) 
1=1 

The sum is taken over the N — 1 bonds and therefore the spring energy is related to 
the square deviation of the bond length from the minimum-potential distance bo, which 
sets the characteristic length scale. The parameter a controls the bond strength and in 
the strong-coupling limit a ^ oo, the fixed-bond behavior of the original AB model is 
approached. 

The kinetic energy is £'kin(P) = P^P/2m, where P = {pi, . . . , p;v} is the set of the N 
monomer momentum vectors. Independent of their type, all monomers shall have the same 
mass m of the order of the average mass of an amino acid. The Hamiltonian 

7i(P,R) = Ekin(P) + V^(R) (6) 

is equivalent to the total energy of the system and constant in time, ?i(P, R) = E — 
const. Throughout the heteropolymer simulations, natural units are employed, in which 



In this comparative study, we concentrate ourselves on the exemphfied AB heteropolymer 
sequence SI: BA2BAiBABA2BA^B with 20 monomers [14 being hydrophobic (^4) and 6 
polar {B)]. The thermodynamic properties of the fixed-bond heteropolymer with this se- 
quence have been analysed in detail in Ref . [18] . The AB model with fixed bond lengths has 
also proven quite useful in a systematic characterisation of heteropolymer folding channels 
known from realistic proteins [19]. 

2.2 Molecular dynamics with Andersen and Nose-Hoover ther- 
mostat 

Conventional molecular dynamics is governed by Newton's equations of motion, which read 
in Hamiltonian form for the ith particle 

= Vp,?i(P,R) = ^ , (7) 
m 

P^ = -Vr,7i(P,R) = -Vr,V^(R). (8) 

System trajectories lie on a constant-energy surface, E = const. In order to conserve the 
energy in molecular dynamics simulations with discretized time steps, symplectic integra- 
tors are required, which provide the stability of the phase-space trajectory under time 
reversal. 

In this standard form of molecular dynamics, the states of the system form a microcanonical 
ensemble in an energy shell E — AE < E < E + AE with AE/E <^ 1. The temperature 
does not serve as an external control parameter and is defined as T = (dS{E)/dE)~^ via 
the microcanonical entropy S{E) — ks In Je^.^ dE'g{E'), where g{E) is the density of states 
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with energy E [20]. 

In many applications, however, it is desirable to adjust the system temperature T by a 
heat bath. Consequently, the total energies of system states follow the canonical Boltzmann 
distribution Pcan(-E') ~ exp(— E'/Zc^T'). Typically, the folding temperature of a peptide 
determines the characteristic energy scale. There are mainly two classes of approaches to 
introduce thermostats into Hamilton's equation of motion: by stochastic collision forces or 
via virtual deterministic extensions of the phase space. In our MD simulations, we have 
used the stochastic Andersen thermostat [7] and the deterministic Nose-Hoover chain [10]. 
Using the Andersen thermostat, a monomer experiences a random collision with a fictitious 
heat-bath particle after each time step St with probability PcoU = ^ St, where u is the colli- 
sion frequency. Therefore, for uncorrelated random forces, the probability for a collision at 
time t is Poissonian, Pcon{^,t) — exp{—i/t). In case of a collision, each component of the 
new momentum p = p\-'\ j = 1, 2, 3, of the selected monomer is drawn from the canonical 
Maxwell-Boltzmann distribution Pcan(p) — exp(— p^/2m/cBT)/-v^7rm^^T, whose width is 
determined by the temperature T. In Andersen dynamics, each monomer behaves like a 
Brownian particle under the influence of the external field induced by the other nonbonded 
monomers and the springs. After infinitely long time t, the phase-space trajectory, con- 
sisting of a set of nonconnected deterministic fragments, will have covered the complete 
accessible phase space, which is sampled according to the Boltzmann distribution Pcan(-E'). 
Nose-Hoover dynamics is more complex, as the phase space is extended by 2M additional 
degrees of freedom, and p^^, k — 1, . . . , M. These dynamical variables effectively rep- 
resent the coupling of the system to the heat bath. The idea is that the high-dimensional 
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phase space provides the particles with more flexibihty to leave the E — const, trajectory 
in the "true" (P, R) phase space in a completely deterministic "extended" dynamics. The 
Nose-Hoover energy 

M J2. M 

^NHC = ^(P, R) + E ^ + SiVfcBTCi + keT J2 ik, (9) 

fe=l fe=2 

which is conserved in the multi-dimensional phase space, H^nc = const., is defined in such 
a way that by integrating out the fluctuations of the additional degrees of freedom, the 
(P, R) states are distributed according to the canonical ensemble. However, the extended 
system is not Hamiltonian anymore and the derivation of the equations of motion requires 
some care. Extending the Hamiltonian equations of motion (7) and (8), the Nose- Hoover 
equations of motion read [2, 8]: 

r. = - , (10) 
m 

P, = -Vr,y(R) - gip. , (11) 

a = (12) 
P. = (gg-3^/^.T)-|p,, (13) 

Pa = ^ - ksT - ^P,,{1 - S,m) . (14) 

The numerical values of the virtual masses Qk of the coupling variables influence the 
dynamics but, in principle, not the statistical averages. For systems, where the total 
energy is the only conserved quantity in the dynamics, the choice of a single coupling 
coordinate M = 1 is sufficient. In order to destroy additional symmetries, however, M > 1 
couplings are required and their Nose-Hoover equations of motion form the linear Nose- 
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Hoover chain [2, 10]. A prominent exceptional example is the one-dimensional harmonic 
oscillator, where two coupling degrees of freedom are necessary. 

The numerical integration of the equations of motion in our simulations with Andersen and 
Nose- Hoover thermostat was performed with the standard Stormer-Verlet algorithm [2, 21]. 

2.2.1 Implementation details of the NHC thermostat 

In Nose-Hoover dynamics, the temporal propagation for a time step 5t of the system and 
the heat-bath coupling degrees of freedom is governed by the time evolution operator 
C^NHc(<^^), which can be decomposed into the Trotter factorized form 

^^^CpH ^^CnSt/2 ^iCc&t/2 ^ (^^^^3y (^5) 

In this expression, Cp, jCr, and jCq are the Liouville operators of the monomer momenta 
P, the monomer coordinates R, and the heat-bath coupling chain degrees of freedom, 
respectively. In higher-order integration schemes, the time step of heat-bath coupling 
propagation is divided further into ric equidistant steps. Thus, 
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iCcSt/2 — YlYl e''''C"''^''*/^"'= . (16) 
j=i k=i 

In our NHC-MD simulations, we mainly followed the procedure described in Ref . [22] . We 
applied a 5th order integration scheme (m = 3) and set ric — 1; i-e., the error is of order 
0{5t^). The Yoshida-Suzuki parameters Wk are wi^3 = 1/(2 - 2^/3)^ W2 = l- 2wi [23, 24]. 
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2.2.2 The choice of the virtual masses for the heat-bath-coupling degrees of 
freedom 

The masses Qk {k — 1, . . . , M) of the virtual heat-bath "particles" influence the coupling 
strength and, therefore, the dynamics of the correlations between the degrees of freedom 
of the heat-bath and the system. Large thermal inertia Qk cause a large time scale for 
the fluctuations of the heat-bath degrees of freedom. Depending on the fastest time scale 
of the system, the thermostat may not be capable in balancing these fluctuations. On 
the other hand, too small values of Qk induce high-frequency fluctuations into the system, 
which equilibrates then very slowly. 

For the one-dimensional harmonic oscillator Ti, — /2m -\- muPx^ /2 with m — 2 and 
u;^ = 1/2, Fig. 1 shows for different choices of Qi and Q2 the relative errors of the canonical 
position and momentum distributions as measured in NHC-MD simulations with M — 2 
Nose-Hoover thermostats, compared with the exact distributions. The data were obtained 
by performing 10^ time steps of width 6t = 0.01 at T = 5. The relative error of the 
measured histogram compared with the exact distribution is noticeably dependent on the 
values of Qi and Q2- The biggest errors are found for very small and very large Q- values, 
whereas fluctuations and response seem to be balanced much better for moderate choices 
of order Qi^2 ~ This quahtatively conflrms the suggestion in Ref. [10] to relate the 

QkS to the fastest time scales of the heat-bath (as induced by the thermal energy ~ ksT) 
and the system {t — l/co) by choosing 

Qk = fkhsTr^ (17) 
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where fi — D N {D is the spatial dimension) and fk>i — 1- As we can also infer from 
Fig. 1, our NHC-MD implementation works quite well for the one-dimensional harmonic 
oscillator with the properly adjusted virtual masses (Qi = Q2 = W in our units). 
For more complex systems, the identification of r is not obvious. It can be related, e.g., 
to the fastest fluctuations and thus the largest mode in the spectrum of autocorrelation 
functions. The normalized autocorrelation function of a time-dependent quantity s{t) is 
defined as 

_ {sit)sit + At))-{sit)y 

^ - — {sm - {s{t))- — ' ^^^^ 

where (. . .) is the temporal average over the time series, which in equilibrium is identical 
with the statistical ensemble average. 

In Fig. 2, the Fourier transforms, i.e., the frequency spectra, of the velocity autocorrelation 
function, Ay{u!), and of the bond-length autocorrelations, 74j..._^i(a;), are shown for the 
heteropolymer sequence SI. Assuming that the fiuctuations of the harmonic springs (5) 
are of shortest time scale, the bond strength a — mw1^^^/2 defines the time scale: 

Tbond = ^bond = \j (^^) 

The results for the autocorrelations obtained with an exemplified NHC-MD run at T = 1.0 
with M = 2 Nose-Hoover thermostats as shown in Fig. 2 justify this assumption. Both 
autocorrelation functions have a peak close to uj/uJhond — 1, which is the highest-frequency 
mode. Therefore, we use Tbond for adjusting the virtual masses in Eq. (17) in our NHC-MD 
heteropolymer simulations. 
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2.3 Replica-exchange Monte Carlo method 

For verifying the statistical results obtained with NHC-MD of the heteropolymer model, 
a comparison with exact results is not possible. Therefore, we use a standard Markov 
chain Monte Carlo algorithm as reference method. Since the dynamics of conventional 
Metropolis sampling at fixed temperature notoriously slows down close to temperatures, 
where conformational transitions occur and also, in the dense polymer limit, a generalized- 
ensemble method is more appropriate for a comprising study of thermodynamics. A sim- 
ple and efficient sampling scheme is provided by the replica-exchange or parallel tempering 
method [13] . In this method, threads of Markov chains at different, deliberately chosen tem- 
peratures run in parallel and frequent trials to exchange conformations between the threads 
ensure a reasonable sampling of the accessible conformational space. In most applications, 
MC methods serve to accumulate statistics by samphng regions of the configurational space 
that dominate the ensemble at a given temperature. Although sometimes also employed 
for kinetic studies, the MC dynamics of the system is typically of less importance. For most 
quantities of interest, the kinetic part in the Hamiltonian (6) does not influence statistical 
averages, i.e., the momentum fluctuations can be integrated out exactly. Therefore, in our 
MC simulations, the system experiences only coordinate updates. In a replica-exchange 
step, the actual conformation R with reciprocal thermal energy (3 — l/ksT is tried to be 
exchanged with the polymer conformation R' being currently present in a neighbor thread 
running at (5' — 1/kBT' . The acceptance probability for this exchange is simply given by 




(20) 
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where A = (/?' — /3)[y(R) — y(R')]. Hence, for A < 0, the exchange is always accepted. 
If A > 0, the exchange is accepted with the Boltzmann-hke probabihty e~^. A reason- 
able acceptance rate can only be achieved, if the canonical histograms of the system have 
sufficient overlap at the exchange temperatures T and T'. Therefore, the efficiency of the 
method strongly depends on the careful choice of the number of threads and the associ- 
ated temperatures. Conformational updates between the exchange trials within a thread 
at constant temperature are accepted according to the Metropolis transition probability. 
In fixed-bond simulations conformational changes were performed using spherical-cap up- 
dates [18]. For simulations of the spring model, we used simple Cartesian updates, where 
a monomer or a bond is moved. An enormous advantage of the method is that it can 
easily be parallelized as only the temperatures between the threads, running on individual 
processors, have to be communicated. 

3 Thermodynamics of SI: Comparison of results from 
MD and MC simulations 

In the following, we analyse the thermodynamic behavior of the fiexible-bond model for 
the heteropolymer sequence SI from results obtained with Andersen MD (A-MD), Nose- 
Hoover chain MD (NHC-MD), standard Metropolis MC (M-MC), and replica-exchange 
MC (RE-MC) simulations. The methods and their specifications in the simulations are 
fisted in Table I. 
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3.1 Energetic and conformational fluctuations 

Energetic fluctuations are expressed by the speciflc heat per monomer via Cy (T) = (("W^) — 
{7i)^)/NkBT'^- For Hamiltonian systems in three dimensions (6), this can be written as 

Cv{T) = C^- + {{ViRf) - {V{R)r) , (21) 

with the constant kinetic contribution Cy™ — SkB/'i- In our analyses, we consider only 
the potential energy contribution allowing for a direct comparison with results from MC 
simulations. 

Frequently used conformational quantities in polymer physics are the end-to-end distance 

i?ee(R) = kiv-ril (22) 

and the radius of gyration 

i?gyr(R) 



N ^ N 



^ ]Y - '^o)^ ro = — ^Fi, (23) 



1=1 



which is a measure for the compactness of the polymer. In particular, the fluctuations 

d 1 

^(-Ree.gyr) = ((-Ree,gyr-E) - (i?ee,gyr) (-E)) (24) 

are of interest, as divergences or extremal points in their temperature dependence are sig- 
nals for cooperative conformational activity, i.e., they indicate conformational transitions. 
It should be noted that for flnite-length systems, as heteropolymers deflnitely are, the fluc- 
tuations do not collapse at a certain transition temperature. Rather, different quantities 
signalize activity typically at different temperatures forming a transition region [25]. 
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3.2 Thermodynamic properties of SI 

As a first appfication, we discuss how the thermodynamic behavior of the heteropolymer 
with sequence SI depends on the flexibihty of the virtual covalent bonds. In Figs. 3(a)-(c), 
the specific heat as well as the fiuctuations of gyration radius and end-to-end distance are 
shown for different choices of the harmonic coupling strengths a — 5, 10, 50. These results 
were obtained from RE-MC simulations with high precision and serve as reliable reference 
data. For large coupling strengths, these data reproduce former results obtained in studies 
for heteropolymers with stiff bonds, where other sophisticated MC techniques were em- 
ployed [18, 19]. Error bars shown were calculated using the jackknife binning method [26]. 
In all plots, there is a peak around T 0.8. Although the maximum values depend on a, 
the peak temperature does not. In fact, these peaks are indications for the conformational 
transition between random-coil structures and globular conformations and is, therefore, 
seen in all fiuctuations. There is also a second region of activity at lower temperatures, 
with a comparatively weak signal in d{Rgyr) /dT. Actually, the heteropolymer does not 
experience a further collapse, but rather an energetically favorable rearrangement of the 
monomers. This is an indication for the typical heteropolymer-specific effect of the forma- 
tion of a compact hydrophobic core. The increasing strength of this signal for stiffer bonds 
can thus be explained with the larger barrier associated with the monomer rearrangement. 
This is not surprising as the fluctuation width of the springs decreases, i.e., the N — 1 
bond degrees of freedom are "frozen". This effect is maximal for fixed bonds [18], where 
the heteropolymer possesses only 3N — {N — 1) = 2N + 1 degrees of freedom. It should 
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be noted that, due to the different number of degrees of freedom, the specific heat even for 
the spring model with extremely stifi^, but not fixed, bond lengths a — > oo differs from the 
fixed-bond case by /cb/2 per bond. In Fig. 3, the curves for the fixed-bond case [18] are 
included for comparison (for the associated specific heat including the "ficitious" constant 
offset per monomer, [N — l)kB/'2N, compensating the frozen-bond constraint). 
For comparing thermodynamic quantities, we performed RE-MC simulations and respec- 
tive A-MD and 2NHC-MDs for SI with relatively stiff bonds, a = 50. The results for the 
specific heat and the fiuctuations of gyration radius and end-to-end distance are shown in 
Figs. 4(a)-(c), respectively. As can clearly be seen, RE-MC and A-MD results coincide 
for all quantities for a wide range of temperatures. This is obviously not the case for the 
2NHC-MDs data points which deviate for temperatures T > 0.4 seriously from the RE-MC 
results. The qualitative thermodynamic behavior, i.e., the occurrence of conformational 
transitions, is still identified in all cases and the peak temperatures are comparable, but 
the quantitative agreement for the fiuctuating quantities between REl-MC and 2NHC-MDs 
is very poor. 

In order to see whether the deviation is due to a too small length of the Nose-Hoover 
chain, we repeated the NHC-MD simulations with M — 3 and M — 4 thermostats. A 
noticeable change of the results was, however, not expected as no model-specific additional 
conserved quantities were identified. The simplicity of the model allowed for extremely 
long equilibration times and run lengths, as listed in Table I. Differences between the 
independent MD runs, worth to be discussed in detail, were not found. This is confirmed 
by the results shown in Fig. 5, where the output of the several tested NHC-MD variants 
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for the specific heat is plotted. In particular, increasing the run time of 2NHC-MDs by a 
factor of 20 (2NHC-MD1) does not improve the results noticeably. 

Another check we performed was to change the virtual masses. Prom the analysis of 
the dependence of the statistical results on the masses for the one-dimensional harmonic 
oscillator in Sec. 2.2.2, it is clear that a careful choice of the Qk values is required to obtain 
reliable results. The simulations leading to the 2NHC-MDs results shown in Figs. 4(a)- 
(c) and Fig. 5 were based on the relation (17) as suggested in Ref. [10]. In order to 
see the influence of the virtual masses on the results, we used in additional simulations 
different simple rescahngs of the values given by Eqs. (17) and (19), i.e., for our polymer 
with a = 50, m = 1, and N = 20, we chose as reference values Qi{T) = 0.6 T and 
Q2{T) — 0.01 T. The results for the specific-heat estimates are shown in Fig. 6, again 
compared with the replica-exchange MC values. Actually, the results depend on the virtual 
masses, but simple rescahng obviously does not solve the problem: The "best" choice for 
high temperatures, Qk — Q^/IOOO, produces wrong results in the intermediate temperature 
region. It seems that the temperature dependence of the optimal QkS is nontrivial and the 
assumption of a linear T-dependence in Eq. (17) is insufficient. If no data are available for 
comparison, the quality of the 2NHC-MDs results cannot be appraised. This is, of course, 
a substantial problem. 

In further tests, it also turned out that the 2NHC-MDs results depend on the initially cho- 
sen conformation, i.e., the initial condition for the Nose-Hoover dynamics was not forgotten 
throughout the run. In consequence, the thermodynamic equilibrium state space was not 
sampled reliably. This aspect is directly connected with the general MD heteropolymer 
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folding problem: Starting from an unfolded, denatured conformation, the native, folded 
state was rarely found. On the other hand, also unfolding from the initialized native 
conformation is slow, but the sampling in intermediate and high temperature regions is 
clearly improved. Prom the above results, it is not surprising that the strongest deviations, 
compared with the RE-MC results, are noticed close to the collapse transition, where an 
appropriate sampling of the collapsed and the random-coil phase is required. 
We have also repeated the simulations for a small polymer with anharmonic interactions 
among non-bonded monomers in order to find out whether systematic deviations of statis- 
tical quantities are also present in this much simpler system. The potential energy of this 
system, 

V(R) = ^;harm(R) + ^;anharm(R), (25) 

consists of the harmonic bond energy Vharm as already defined in Eq. (5) and an anharmonic 
potential Vanharm for the interaction between non-bonded monomers: 



N-l N 

^anharm = E E [^rim - ^D' + ^Tlm - ^D'] ■ (26) 
1=1 m=l+l 



In our simulations, the model was parametrized by setting a — 50, 71 = 10, and 72 = 1. 
The equilibrium distance between bonded and non-bonded monomers was bo — 6q°° = 1. 
The simulations were perfomed for a small homopolymer with N = 5 monomers (sequence 
A5). Again, statistics of parallel tempering MC simulations was compared to results from 
MD simulations with NHC thermostat. The particular parameters and run lengths were 
chosen according to the values given for RE-MC and 2NHC-MD1 in Table I. Two types 
of initial conformations were used for the MD simulations. In one set of simulations, 
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we started from random conformations at each temperature, while in the other case low- 
energy crystalline conformations were chosen, constructed from a tetrahedron with an 
additional monomer mirrored at one face. In Fig. 7, results for the specific heats of the 
anharmonic 5-mer, obtained with RE-MC and differently initialized 2NHC-MD1, are shown. 
Again, the results of the Nose- Hoover MD simulations deviate systematically from the MC 
output. As for the Lennard- Jones heteropolymer SI, the deviations become stronger at 
high temperatures, where the NHC coupling to the heat-bath is more relevant. 

3.3 Autocorrelation time analysis for MD and MC 

For long time intervals At, the autocorrelation function (18) decays exponentially, As{At) ~ 
exp(— At/rexp), where Texp is the exponential autocorrelation time. Therefore, the auto- 
correlation time is a measure for the decay rate of correlations in equilibrium and depends 
on the dynamics of the algorithm which is employed to generate the time series data. The 
statistical significance of a data set is connected with small temporal correlations. This 
means that in a time series with L entries only about ~ L/Te^p data points are un- 
correlated and determine the statistical error of the data. For this analysis, the origin of 
the time series, i.e., the inherent algorithmic time scale, is irrelevant and, therefore, the 
autocorrelation functions of time series obtained with different methods can be compared. 
In particular, the autocorrelation time is a good quantitative measure for the efficiency of 
algorithms in sampling the relevant state space. 

For two fixed temperatures, T — 0.25 and T — 1.0, we have compared the autocorrelation 
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functions and autocorrelation times of standard M-MC simulations, A-MD, and 2NHC- 
MDs for the Lennard- Jones heteropolymer SI. In the M-MC case, the time difference At 
is measured in MC sweeps, where, within a single sweep, the coordinates of all monomers 
are sequentially tried to be moved randomly. The conformational changes are accepted 
according to the Metropolis criterion, with the acceptance rate adjusted at around 50%. 
For the MD runs. At is measured in units of time steps 6t. 

Figure 8 shows the respective autocorrelation functions of the M-MC, A-MD, and 2NHC- 
MDs simulations for T — 0.25 and T — 1.0. In Table II, we have hsted the autocorrelation 
times for the two fixed temperatures. Similarly to the MC results, the autocorrelation 
time in the MD runs also decreases with temperature. The autocorrelation times differ 
noticeably for the three methods compared with each other. Not unexpectedly, autocorre- 
lations decay fastest for M-MC. The values of Texp, which are of the order O{10^ — 10^) MC 
sweeps respective MD time steps for the two temperatures considered, are much smaller 
than the run lengths of the simulations (see Table I). Thus, Ues ~ 10^ . . . 10^ data points 
are uncorrelated in all MC and MD runs. For this reason, the statistical error bars for all 
of our MC and MD results are very small. Thus, the partly large deviations in the results 
for energetic and structural fluctuations, in particular near the conformational transitions 
(see, e.g.. Fig. 4) and for high temperatures (as in Fig. 7), are not of statistical nature. 
Rather, we conclude that the system behaves non-ergodic, i.e., not all sections of the 
physical phase space being thermodynamically relevant at the given temperature T are 
covered by intersection points of the trajectory projected from the extended Nose-Hoover 
phase space. The almost constant part of the T — 1 2NHC-MD curve in Fig. 8 around 
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At — 10 000 ... 20 000 is also an indication that the system got stuck in a local free-energy 
minimum. 

4 Summary 

In this study, we have shown by explicit comparison with results from Monte Carlo simu- 
lations that even for a minimahstic model at mesoscopic length scales Nose-Hoover chain 
molecular dynamics simulations are not capable to reproduce the correct thermodynamic 
behavior of heteropolymers. Prom our analysis, we conclude that for the polymer systems 
investigated in our study, the proper stable thermodynamic equilibrium cannot be reached 
in molecular dynamics simulations with Nose- Hoover chain thermostats and results depend 
on the initialization of the systems. In consequence, the sampling of folding and unfold- 
ing events is insufficient. Although the results for low temperatures are comparable with 
replica-exchange Monte Carlo data, it should be noted that in the NHC-MD runs folding 
events were hardly observed. Therefore, the correct formation of the hydrophobic core 
towards the native fold did not happen and sampling at very low temperatures, i.e., in the 
hydrophobic-core dominated region, is only reasonable, if the MD run is initialized with 
the native state. For intermediate and high temperatures, we find a serious dependence 
of the results on the choice of the values for the virtual masses of the heat-bath coupling 
degrees of freedom. 

The exemplified heteropolymer used in our study possesses only 20 monomers and is thus 
comparatively small. Its folding characteristics is not particularly complex as the stiffness 
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of the virtual bonds has been relaxed to simplify the MD implementation. It should 
be noted, however, that substituting the Nose-Hoover thermostat against the Andersen 
thermostat with random coUisions significantly improves the MD results. 
In consequence, for statistical analyses of heteropolymers in a wide range of temperatures, 
the applicability of canonical constant-temperature molecular dynamics simulations with 
deterministic thermostats is rather limited. For realistic models, the complexity of the 
microscopic description at the atomic scale is known to extremely slow down NHC-MD 
folding simulations. Here, however, we used a much simpler coarse-grained model and 
folding events have also not been adequately recovered. 

It should be noted, however, that NHC-MD has proven to be quite successful in explaining 
dynamic processes at time scales much shorter than folding times, where, e.g., selected 
biological functions of proteins under physiological conditions can be studied. Interest- 
ing examples, where the application of NHC-MD methods proved very useful, are water 
penetration into a cell through the aquaporin membrane protein [27] and the ATP syn- 
thase, where the catalytic subunits of Fl, embedded into the membrane FO proton channel, 
partially act as rotating "molecular motor" that promotes dehydration of ADP and P to 
ATP [28]. Such studies require that the native folds of the proteins must be known as 
these are used as input. Substantial conformational changes of the proteins do not occur 
or are limited to small segments of a few amino acids. 
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List of figure captions 

Fig. 1: 

Relative errors |p^n /Pcan — 1| of the canonical position (solid line) and momentum (dashed 
line) distributions Pce,n{x) andpcan(p) for a one-dimensional harmonic oscillator, estimated 
in NHC-MD simulations with M = 2 for different choices of virtual masses Qi and Q2 at 

r = 5. 

Fig. 2: 

Frequency spectra of the velocity autocorrelation function Ay{u!) (upper, solid line) and of 
the bond- length autocorrelation Ar^^_^_^{u) (lower, dashed hue) for bond strength a = 50 
at T = 1. 

Fig. 3: 

RE-MC results for the heteropolymer SI: (a) specific heat per monomer, fiuctuations of (b) 
gyration radius and (c) end-to-end distance as functions of the temperature for different 
strengths a of the harmonic bonds. For comparison, also the results for a stiff polymer 
(fixed bond length) are shown [for Cy, the effect of the reduced number of degrees of 
freedom is artificially compensated by a constant offset {N — 1)/2N]. Jackknife error bars 
are also shown, but are very small. 
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Fig. 4: 

Comparison of results from RE-MC, 2NHC-MDs, and A-MD simulations with bond strength 
a = 50 for the heteropolymer SI: (a) specific heat per monomer, fluctuations of (b) gyra- 
tion radius and (c) end-to-end distance as functions of the temperature. Jackknife error 
bars are of symbol sizes or smaller. 

Fig. 5: 

Results for the specific heat as obtained with the variants 2NHC-MDs, 2NHC-MD1, 3NHC- 
MD, and 4NHC-MD. For comparison, the RE-MC curve is also shown. 

Fig. 6: 

Collapse transition region of the specific heat as obtained from 2NHC-MDs runs for differ- 
ent choices of the virtual masses Qk. The reference values Q — Qi^2 are given by Eq. (17) 
as suggested in Ref. [10]. 

Fig. 7: 

Specific heats obtained from RE-MC and 2NHC-MD1 runs with random and ordered start 
conformations for a 5-mer with anharmonic interaction (25), (26) between non-bonded 
monomers. 

Fig. 8: 

Autocorrelation functions from M-MC, A-MD, and 2NHC-MDs runs and fits ~ exp(-At/rcxp) 
at temperatures T — 0.25 and T — 1.0. 
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Table I: Methods and specifications used in this study. Equihbration times and run lengths 
are given in MC sweeps or MD steps, respectively. 



method 



label 



M equilibration run length 



Metropolis MC 

Replica- Exchange MC 

MD with Andersen thermostat 



M-MC 



RE-MC 



A-MD 



MD with Nose-Hoover chain thermostat 2NHC-MDs 2 



1 X 10^ 



1 X 10^ 



1 X 10^ 



1 X 10^ 



2NHC-MD1 2 1 X 10^ 



3NHC-MD 



1 X lO'^ 



1 X 10^ 



3 X 10^ 



3 X 10^ 



3 X 10^ 



6 X 10^ 



6 X lO'^ 



4NHC-MD 4 1 X 10^ 



6 X 10^ 
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Table II: Exponential autocorrelation times from M-MC, A-MD, and 2NHC-MDs runs at 
T = 0.25 and T = 1.0. 



''"exp 



M-MC 0.25 27 x 10^ 

A-MD 0.25 60 x 10^ 

2NHC-MDS 0.25 124 x 10=^ 

M-MC 1.0 1 X 10^ 

A-MD 1.0 18 X 10^ 

2NHC-MDS 1.0 7 X 10=^ 
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